I. Introduction


Table 1. Sample of 5 randomly chosen countries from the data set used in this study
Country       agricultural_land_p_2016  food_index_2015  forest_area_p_2015
Nigeria       77.73642                  125.77           7.6781185
Morocco       68.54470                  139.01           12.6193144
South Africa  79.83002                  122.05           7.6177365
Djibouti      73.42537                  132.78           0.2415876
Gambia, The   59.78261                  101.24           48.2213439

Country       livestock_index_2015  population_growth_p_2015  aded_val_GDP_2015
Nigeria       118.26                2.647419                  20.631894
Morocco       149.56                1.368838                  12.628068
South Africa  131.92                1.528926                  2.088751
Djibouti      133.75                1.687809                  1.148304
Gambia, The   110.77                3.008474                  22.208886

II. Exploratory data analysis


Table 2. Summary statistics for the percentage of agricultural land across countries in 2016
n    min        median    mean     max       sd
183  0.5576923  39.65613  38.7442  82.55971  21.83368

Figure 1. Distribution of the percentage of agricultural land across countries in 2016


Figure 2. Distribution of the 2015 food production index across countries


Figure 7.1. Interactive scatterplot of the percentage of agricultural land in 2016 against the 2015 food production index, by country. The red line is the line of best fit; the blue curve is the Loess curve.

Figure 3. Distribution of the percentage of forest area across countries in 2015


Figure 7.1. Interactive scatterplot of the percentage of agricultural land in 2016 against the percentage of forest area in 2015, by country. The red line is the line of best fit; the blue curve is the Loess curve.

Figure 3. Distribution of the livestock production index across countries in 2015


Figure 7.1. Interactive scatterplot of the percentage of agricultural land in 2016 against the 2015 livestock production index, by country. The red line is the line of best fit; the blue curve is the Loess curve.

Figure 3. Distribution of the annual population growth (%) across countries in 2015


Figure 7.1. Interactive scatterplot of the percentage of agricultural land in 2016 against the annual population growth (%) in 2015, by country. The red line is the line of best fit; the blue curve is the Loess curve.

Figure 3. Distribution of the value added by agriculture, forestry, and fishing to GDP across countries in 2015


Figure 7.1. Interactive scatterplot of the percentage of agricultural land in 2016 against the value added by agriculture, forestry, and fishing to GDP in 2015, by country. The red line is the line of best fit; the blue curve is the Loess curve.


III. Multiple linear regression

i. Methods


## 
## Call:
## lm(formula = agricultural_land_p_2016 ~ ns(food_index_2015, df = 4) + 
##     forest_area_p_2015 + livestock_index_2015 + ns(population_growth_p_2015, 
##     df = 4) + ns(aded_val_GDP_2015, df = 4), data = tidy_joined_dataset)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -53.506 -11.607  -0.709  12.992  37.508 
## 
## Coefficients:
##                                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                            28.37655   19.90532   1.426 0.155846    
## ns(food_index_2015, df = 4)1           19.45108   13.01581   1.494 0.136942    
## ns(food_index_2015, df = 4)2           16.96892   10.44704   1.624 0.106191    
## ns(food_index_2015, df = 4)3           30.44086   30.11393   1.011 0.313539    
## ns(food_index_2015, df = 4)4            1.13055   12.60260   0.090 0.928627    
## forest_area_p_2015                     -0.42675    0.06212  -6.869  1.2e-10 ***
## livestock_index_2015                   -0.04713    0.05606  -0.841 0.401732    
## ns(population_growth_p_2015, df = 4)1  -6.12866   11.02712  -0.556 0.579100    
## ns(population_growth_p_2015, df = 4)2   9.32839   10.42530   0.895 0.372183    
## ns(population_growth_p_2015, df = 4)3 -38.11209   25.48425  -1.496 0.136656    
## ns(population_growth_p_2015, df = 4)4 -53.30382   14.21244  -3.751 0.000243 ***
## ns(aded_val_GDP_2015, df = 4)1         15.39014    6.21166   2.478 0.014215 *  
## ns(aded_val_GDP_2015, df = 4)2         11.90038    8.95583   1.329 0.185720    
## ns(aded_val_GDP_2015, df = 4)3         47.12623   14.67307   3.212 0.001581 ** 
## ns(aded_val_GDP_2015, df = 4)4         10.92609   14.45618   0.756 0.450823    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 18.58 on 168 degrees of freedom
## Multiple R-squared:  0.3315, Adjusted R-squared:  0.2758 
## F-statistic:  5.95 on 14 and 168 DF,  p-value: 1.957e-09
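
Each spline term in the call above contributes four coefficients because ns(x, df = 4) expands one predictor into four basis columns. The short sketch below illustrates this on a hypothetical grid of index values. The vector x is an assumption made purely for illustration and is not taken from tidy_joined_dataset; the only function used is ns() from the splines package that ships with R.

```r
library(splines)  # provides ns(); part of the standard R distribution

# Hypothetical, evenly spaced index values used only to show the basis shape.
x <- seq(80, 160, length.out = 50)

# Natural cubic spline basis with 4 degrees of freedom, as in the model formula.
basis <- ns(x, df = 4)

# One row per value of x, one column per estimated coefficient.
print(dim(basis))

# Interior knots that ns() placed at quantiles of x.
n_knots <- length(attr(basis, "knots"))
print(n_knots)
```

Because the four basis columns jointly describe one curve, the individual spline coefficients are hard to read in isolation; a term-level test on 4 degrees of freedom, such as the rows of Table 5, is usually the more natural summary.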
Figure 14. Normal Q-Q plot for the percentage of agricultural land across countries in 2016


Figure 15. Distribution of the residuals of the fitted model


Figure 16. Residuals against the fitted values, with a Lowess curve in blue and a horizontal line at zero in red.


Figure 17. Residuals against the food production index, with a Lowess curve in blue and a horizontal line at zero in red.


Figure 18. Residuals against the percentage of forest area in 2015, with a Lowess curve in blue and a horizontal line at zero in red.


Figure 18. Residuals against the livestock production index in 2015, with a Lowess curve in blue and a horizontal line at zero in red.


Figure 18. Residuals against the annual population growth (%) in 2015, with a Lowess curve in blue and a horizontal line at zero in red.


Figure 18. Residuals against the value added by agriculture, forestry, and fishing to GDP in 2015, with a Lowess curve in blue and a horizontal line at zero in red.


Table 3. VIF table
Term                                  GVIF      Df  GVIF^(1/(2*Df))
ns(food_index_2015, df = 4)           2.464925  4   1.119375
forest_area_p_2015                    1.109047  1   1.053113
livestock_index_2015                  1.762298  1   1.327516
ns(population_growth_p_2015, df = 4)  2.053750  4   1.094129
ns(aded_val_GDP_2015, df = 4)         2.119023  4   1.098416
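
The last column of Table 3 rescales each GVIF by its degrees of freedom so that terms with one coefficient and terms with four coefficients can be compared on the same footing. As a minimal check, the sketch below recomputes that column from the GVIF and Df values printed in the table; the data frame and its column names are illustrative and not part of the original analysis.

```r
# GVIF and Df values copied from Table 3 (term labels shortened for display).
vif_table <- data.frame(
  term = c("food_index_2015 (spline)", "forest_area_p_2015",
           "livestock_index_2015", "population_growth_p_2015 (spline)",
           "aded_val_GDP_2015 (spline)"),
  gvif = c(2.464925, 1.109047, 1.762298, 2.053750, 2.119023),
  df   = c(4, 1, 1, 4, 4)
)

# Df-adjusted GVIF: GVIF^(1 / (2 * Df)).
vif_table$adjusted <- vif_table$gvif^(1 / (2 * vif_table$df))

# Squaring the adjusted value puts it on the scale of an ordinary VIF.
vif_table$adjusted_sq <- vif_table$adjusted^2

print(vif_table)
```

For the two single-coefficient terms the squared value is simply the GVIF itself. The largest squared value here is about 1.76 (livestock_index_2015), which is well under the thresholds of 5 or 10 that are often quoted as rules of thumb for problematic collinearity.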

ii. Model Results and Interpretation


## lm(formula = agricultural_land_p_2016 ~ ns(food_index_2015, df = 4) + 
##     forest_area_p_2015 + livestock_index_2015 + ns(population_growth_p_2015, 
##     df = 4) + ns(aded_val_GDP_2015, df = 4), data = tidy_joined_dataset)
Table 4. Model Summary Table
Term                                   Estimate   Std. Error  t value   Pr(>|t|)
(Intercept)                            28.37655   19.90532    1.42558   0.15585
ns(food_index_2015, df = 4)1           19.45108   13.01581    1.49442   0.13694
ns(food_index_2015, df = 4)2           16.96892   10.44704    1.62428   0.10619
ns(food_index_2015, df = 4)3           30.44086   30.11393    1.01086   0.31354
ns(food_index_2015, df = 4)4           1.13055    12.60260    0.08971   0.92863
forest_area_p_2015                     -0.42675   0.06212     -6.86937  1.2e-10
livestock_index_2015                   -0.04713   0.05606     -0.84066  0.40173
ns(population_growth_p_2015, df = 4)1  -6.12866   11.02712    -0.55578  0.57910
ns(population_growth_p_2015, df = 4)2  9.32839    10.42530    0.89478   0.37218
ns(population_growth_p_2015, df = 4)3  -38.11209  25.48425    -1.49552  0.13666
ns(population_growth_p_2015, df = 4)4  -53.30382  14.21244    -3.75050  0.00024
ns(aded_val_GDP_2015, df = 4)1         15.39014   6.21166     2.47762   0.01422
ns(aded_val_GDP_2015, df = 4)2         11.90038   8.95583     1.32879   0.18572
ns(aded_val_GDP_2015, df = 4)3         47.12623   14.67307    3.21175   0.00158
ns(aded_val_GDP_2015, df = 4)4         10.92609   14.45618    0.75581   0.45082

Statistic                Value   df
Residual Standard Error  18.581  168
Multiple R-squared       0.331
Adjusted R-squared       0.276

Statistic          Value      Numerator df  Denominator df
Model F-statistic  5.95       14            168
P-value            1.957e-09
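
The adjusted R-squared and the model F-statistic both follow from the multiple R-squared, the number of observations, and the number of estimated slopes. The sketch below reproduces them from the reported values, assuming n = 183 (Table 2) and 14 non-intercept coefficients (the numerator df above); the variable names are illustrative only.

```r
r_squared <- 0.3315  # Multiple R-squared from the model summary output
n <- 183             # number of countries (Table 2)
p <- 14              # coefficients excluding the intercept

resid_df <- n - p - 1  # residual degrees of freedom

# Adjusted R-squared penalises R-squared for the number of coefficients.
adj_r_squared <- 1 - (1 - r_squared) * (n - 1) / resid_df

# Overall F-statistic: explained variation per coefficient over
# unexplained variation per residual degree of freedom.
f_stat <- (r_squared / p) / ((1 - r_squared) / resid_df)

# Upper-tail probability of the F distribution gives the model p-value.
p_value <- pf(f_stat, p, resid_df, lower.tail = FALSE)

print(c(resid_df = resid_df, adj_r_squared = adj_r_squared,
        f_stat = f_stat, p_value = p_value))
```

Up to rounding of the inputs, these agree with the adjusted R-squared of 0.276 and the F-statistic of 5.95 on 14 and 168 degrees of freedom reported in Table 4.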

iii. Inference for multiple regression

Table 5. ANOVA Table
Term                                  Df   Sum Sq      Mean Sq     F value  Pr(>F)
ns(food_index_2015, df = 4)           4    4407.1018   1101.7755   3.1912   0.0148
forest_area_p_2015                    1    13145.4035  13145.4035  38.0750  <0.0001
livestock_index_2015                  1    478.1623    478.1623    1.3850   0.2409
ns(population_growth_p_2015, df = 4)  4    6474.3214   1618.5803   4.6881   0.0013
ns(aded_val_GDP_2015, df = 4)         4    4254.1402   1063.5350   3.0805   0.0176
Residuals                             168  58002.0458  345.2503
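
Every F value in Table 5 is the term's mean square divided by the residual mean square, and the residual row also determines the residual standard error and the multiple R-squared of the fitted model. The sketch below recomputes these quantities from the Df and Sum Sq columns alone; the object names are illustrative and the numbers are copied from the table rather than recomputed from the data.

```r
# Df and Sum Sq copied from Table 5 (term labels shortened for display).
anova_tab <- data.frame(
  term   = c("food_index_2015 (spline)", "forest_area_p_2015",
             "livestock_index_2015", "population_growth_p_2015 (spline)",
             "aded_val_GDP_2015 (spline)"),
  df     = c(4, 1, 1, 4, 4),
  sum_sq = c(4407.1018, 13145.4035, 478.1623, 6474.3214, 4254.1402)
)
resid_df <- 168
resid_ss <- 58002.0458

# Mean squares: sum of squares divided by degrees of freedom.
resid_ms <- resid_ss / resid_df
anova_tab$mean_sq <- anova_tab$sum_sq / anova_tab$df

# F value and upper-tail p-value for each term.
anova_tab$f_value <- anova_tab$mean_sq / resid_ms
anova_tab$p_value <- pf(anova_tab$f_value, anova_tab$df, resid_df,
                        lower.tail = FALSE)

# Quantities that tie the ANOVA table back to the model summary.
resid_se  <- sqrt(resid_ms)
r_squared <- 1 - resid_ss / (sum(anova_tab$sum_sq) + resid_ss)

print(anova_tab)
print(c(resid_se = resid_se, r_squared = r_squared))
```

If the table was produced by R's anova() on the fitted lm object, the sums of squares are sequential, so each row tests a term after the terms listed above it and the values could change if the terms were entered in a different order.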

IV. Discussion

i. Conclusions

ii. Limitations

iii. Further questions


V. Citations and References